Overview

Dataset statistics

Number of variables: 9
Number of observations: 767
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 54.1 KiB
Average record size in memory: 72.2 B

Variable types

NUM: 8
BOOL: 1

Warnings

Nº de gravidez has 111 (14.5%) zeros
Pressão sanguínea has 35 (4.6%) zeros
Trícepis has 227 (29.6%) zeros
Insulina has 373 (48.6%) zeros
IMC has 11 (1.4%) zeros
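
The zero counts and percentages in these warnings can be reproduced with a simple tally per column. The sketch below is an illustration only: the helper name zero_summary and the toy values are hypothetical, and it does not claim to be how pandas-profiling computes its warnings.

```python
def zero_summary(columns):
    """Return {name: (zero_count, zero_percent)} for each numeric column.

    `columns` maps a column name to a list of numbers. The percentage is
    rounded to one decimal place, matching the precision shown in the report.
    """
    summary = {}
    for name, values in columns.items():
        zeros = sum(1 for v in values if v == 0)
        percent = 100.0 * zeros / len(values) if values else 0.0
        summary[name] = (zeros, round(percent, 1))
    return summary


# Hypothetical toy data, NOT the profiled dataset.
toy = {
    "Insulina": [0, 94, 0, 168, 0, 88, 0, 543],
    "IMC": [26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 35.3, 0.0],
}
result = zero_summary(toy)
print(result)
```

Applied to the figures above, 111 zeros out of 767 observations gives 100 × 111 / 767, which rounds to the reported 14.5%.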

Reproduction

Analysis started: 2022-02-13 22:01:17.954604
Analysis finished: 2022-02-13 22:01:38.466283
Duration: 20.51 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

Nº de gravidez (number of pregnancies)
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.842242503
Minimum: 0
Maximum: 17
Zeros: 111
Zeros (%): 14.5%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 3
Q3: 6
95th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.370876524
Coefficient of variation (CV): 0.8773200862
Kurtosis: 0.1612926278
Mean: 3.842242503
Median Absolute Deviation (MAD): 2
Skewness: 0.9039762644
Sum: 2947
Variance: 11.36280854
Monotonicity: Not monotonic
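
Several of the statistics listed for this variable are derived from the others, which gives a quick way to sanity-check the report. The sketch below uses the values reported for "Nº de gravidez" and the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1, mean = sum / number of observations). The variable names are illustrative.

```python
import math

# Values copied from the report for "Nº de gravidez".
n = 767                      # number of observations
total = 2947                 # Sum
mean = 3.842242503           # Mean
std = 3.370876524            # Standard deviation
minimum, maximum = 0, 17     # Minimum, Maximum
q1, q3 = 1, 6                # Q1, Q3

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # interquartile range
mean_from_sum = total / n        # mean recomputed from Sum and the row count

print(f"CV={cv:.10f} variance={variance:.8f} "
      f"range={value_range} IQR={iqr} mean={mean_from_sum:.9f}")

# Compare the recomputed mean with the reported one (relative tolerance 1e-6).
print(math.isclose(mean_from_sum, mean, rel_tol=1e-6))
```

The same checks apply to every numeric variable in the report.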
Histogram with fixed size bins (bins=17)

Common values

Value               Count  Frequency (%)
1                   135    17.6%
0                   111    14.5%
2                   103    13.4%
3                   75     9.8%
4                   68     8.9%
5                   57     7.4%
6                   49     6.4%
7                   45     5.9%
8                   38     5.0%
9                   28     3.7%
Other values (7)    58     7.6%

Minimum 5 values

Value  Count  Frequency (%)
0      111    14.5%
1      135    17.6%
2      103    13.4%
3      75     9.8%
4      68     8.9%

Maximum 5 values

Value  Count  Frequency (%)
17     1      0.1%
15     1      0.1%
14     2      0.3%
13     10     1.3%
12     9      1.2%

Glicose (glucose)
Real number (ℝ≥0)

Distinct: 136
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.8591917
Minimum: 0
Maximum: 199
Zeros: 5
Zeros (%): 0.7%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 79
Q1: 99
Median: 117
Q3: 140
95th percentile: 181
Maximum: 199
Range: 199
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 31.97846846
Coefficient of variation (CV): 0.2645927713
Kurtosis: 0.6429918315
Mean: 120.8591917
Median Absolute Deviation (MAD): 20
Skewness: 0.1764123623
Sum: 92699
Variance: 1022.622445
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
99                  17     2.2%
100                 17     2.2%
129                 14     1.8%
125                 14     1.8%
106                 14     1.8%
111                 14     1.8%
102                 13     1.7%
95                  13     1.7%
105                 13     1.7%
108                 13     1.7%
Other values (126)  625    81.5%

Minimum 5 values

Value  Count  Frequency (%)
0      5      0.7%
44     1      0.1%
56     1      0.1%
57     2      0.3%
61     1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
199    1      0.1%
198    1      0.1%
197    4      0.5%
196    3      0.4%
195    2      0.3%

Pressão sanguínea (blood pressure)
Real number (ℝ≥0)

ZEROS

Distinct: 47
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.10169492
Minimum: 0
Maximum: 122
Zeros: 35
Zeros (%): 4.6%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 38.6
Q1: 62
Median: 72
Q3: 80
95th percentile: 90
Maximum: 122
Range: 122
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.36815466
Coefficient of variation (CV): 0.2802847988
Kurtosis: 5.168577977
Mean: 69.10169492
Median Absolute Deviation (MAD): 8
Skewness: -1.841911017
Sum: 53001
Variance: 375.1254149
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Common values

Value               Count  Frequency (%)
70                  57     7.4%
74                  52     6.8%
78                  45     5.9%
68                  45     5.9%
72                  43     5.6%
64                  43     5.6%
80                  40     5.2%
76                  39     5.1%
60                  37     4.8%
0                   35     4.6%
Other values (37)   331    43.2%

Minimum 5 values

Value  Count  Frequency (%)
0      35     4.6%
24     1      0.1%
30     2      0.3%
38     1      0.1%
40     1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
122    1      0.1%
114    1      0.1%
110    3      0.4%
108    2      0.3%
106    3      0.4%

Trícepis (triceps)
Real number (ℝ≥0)

ZEROS

Distinct: 51
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.51760104
Minimum: 0
Maximum: 99
Zeros: 227
Zeros (%): 29.6%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 23
Q3: 32
95th percentile: 44
Maximum: 99
Range: 99
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 15.95405906
Coefficient of variation (CV): 0.7775791637
Kurtosis: -0.5183252996
Mean: 20.51760104
Median Absolute Deviation (MAD): 12
Skewness: 0.1120576816
Sum: 15737
Variance: 254.5320005
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   227    29.6%
32                  31     4.0%
30                  27     3.5%
27                  23     3.0%
23                  22     2.9%
18                  20     2.6%
28                  20     2.6%
33                  20     2.6%
31                  19     2.5%
19                  18     2.3%
Other values (41)   340    44.3%

Minimum 5 values

Value  Count  Frequency (%)
0      227    29.6%
7      2      0.3%
8      2      0.3%
10     5      0.7%
11     6      0.8%

Maximum 5 values

Value  Count  Frequency (%)
99     1      0.1%
63     1      0.1%
60     1      0.1%
56     1      0.1%
54     2      0.3%

Insulina (insulin)
Real number (ℝ≥0)

ZEROS

Distinct: 186
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.90352021
Minimum: 0
Maximum: 846
Zeros: 373
Zeros (%): 48.6%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 32
Q3: 127.5
95th percentile: 293
Maximum: 846
Range: 846
Interquartile range (IQR): 127.5

Descriptive statistics

Standard deviation: 115.2831052
Coefficient of variation (CV): 1.442778802
Kurtosis: 7.205266456
Mean: 79.90352021
Median Absolute Deviation (MAD): 32
Skewness: 2.270630168
Sum: 61286
Variance: 13290.19433
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   373    48.6%
105                 11     1.4%
140                 9      1.2%
130                 9      1.2%
120                 8      1.0%
100                 7      0.9%
180                 7      0.9%
94                  7      0.9%
115                 6      0.8%
135                 6      0.8%
Other values (176)  324    42.2%

Minimum 5 values

Value  Count  Frequency (%)
0      373    48.6%
14     1      0.1%
15     1      0.1%
16     1      0.1%
18     2      0.3%

Maximum 5 values

Value  Count  Frequency (%)
846    1      0.1%
744    1      0.1%
680    1      0.1%
600    1      0.1%
579    1      0.1%

IMC (BMI, body mass index)
Real number (ℝ≥0)

ZEROS

Distinct: 248
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.9904824
Minimum: 0
Maximum: 67.1
Zeros: 11
Zeros (%): 1.4%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 21.8
Q1: 27.3
Median: 32
Q3: 36.6
95th percentile: 44.41
Maximum: 67.1
Range: 67.1
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 7.889090901
Coefficient of variation (CV): 0.2466074379
Kurtosis: 3.282498397
Mean: 31.9904824
Median Absolute Deviation (MAD): 4.6
Skewness: -0.4279502476
Sum: 24536.7
Variance: 62.23775525
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
0                   11     1.4%
33.3                10     1.3%
32.4                10     1.3%
32.8                9      1.2%
30.8                9      1.2%
30.1                9      1.2%
32.9                9      1.2%
Other values (238)  663    86.4%

Minimum 5 values

Value  Count  Frequency (%)
0      11     1.4%
18.2   3      0.4%
18.4   1      0.1%
19.1   1      0.1%
19.3   1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
67.1   1      0.1%
59.4   1      0.1%
57.3   1      0.1%
55     1      0.1%
53.2   1      0.1%

Chance de diabetes hereditária (hereditary diabetes likelihood)
Real number (ℝ≥0)

Distinct: 516
Distinct (%): 67.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4716740548
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 0.078
5th percentile: 0.1403
Q1: 0.2435
Median: 0.371
Q3: 0.625
95th percentile: 1.1333
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3815

Descriptive statistics

Standard deviation: 0.3314973556
Coefficient of variation (CV): 0.7028102399
Kurtosis: 5.593373853
Mean: 0.4716740548
Median Absolute Deviation (MAD): 0.166
Skewness: 1.921190451
Sum: 361.774
Variance: 0.1098904968
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0.258               6      0.8%
0.254               6      0.8%
0.238               5      0.7%
0.261               5      0.7%
0.207               5      0.7%
0.268               5      0.7%
0.259               5      0.7%
0.245               4      0.5%
0.299               4      0.5%
0.687               4      0.5%
Other values (506)  718    93.6%

Minimum 5 values

Value  Count  Frequency (%)
0.078  1      0.1%
0.084  1      0.1%
0.085  2      0.3%
0.088  2      0.3%
0.089  1      0.1%

Maximum 5 values

Value  Count  Frequency (%)
2.42   1      0.1%
2.329  1      0.1%
2.288  1      0.1%
2.137  1      0.1%
1.893  1      0.1%

Idade (age)
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.2190352
Minimum: 21
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 21
5th percentile: 21
Q1: 24
Median: 29
Q3: 41
95th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.7522956
Coefficient of variation (CV): 0.3537819665
Kurtosis: 0.6608718486
Mean: 33.2190352
Median Absolute Deviation (MAD): 7
Skewness: 1.135164695
Sum: 25479
Variance: 138.1164518
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
22                  72     9.4%
21                  63     8.2%
25                  48     6.3%
24                  46     6.0%
23                  38     5.0%
28                  35     4.6%
26                  33     4.3%
27                  32     4.2%
29                  29     3.8%
31                  24     3.1%
Other values (42)   347    45.2%

Minimum 5 values

Value  Count  Frequency (%)
21     63     8.2%
22     72     9.4%
23     38     5.0%
24     46     6.0%
25     48     6.3%

Maximum 5 values

Value  Count  Frequency (%)
81     1      0.1%
72     1      0.1%
70     1      0.1%
69     2      0.3%
68     1      0.1%

Vive com diabetes (lives with diabetes)
Boolean

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.0 KiB

Value  Count  Frequency (%)
0      500    65.2%
1      267    34.8%

Interactions

[Figures: 64 pairwise interaction plots (8 × 8 numeric variables)]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exact linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
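
As a minimal sketch of that calculation, the function below divides the covariance by the product of the standard deviations using only the standard library. The name pearson_r and the sample lists are illustrative. The 1/n factors in the covariance and in the standard deviations cancel, so the code works with plain sums of deviations.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations (covariance times n).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Root of summed squared deviations (standard deviation times sqrt(n)).
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (dev_x * dev_y)


# Illustrative values only.
x = [1, 2, 3, 5]
y = [2, 1, 4, 3]
r_original = pearson_r(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([3 * a + 7 for a in x], [0.5 * b - 2 for b in y])
print(r_original, r_rescaled)
```

The second call illustrates the invariance under changes in location and (positive) scale mentioned above.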

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
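
A sketch of that procedure: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is an assumption made here (it is a common convention), and the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Illustrative values: y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this monotonic but nonlinear example, ρ reaches 1 while r stays below 1.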

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
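
The pair-counting definition can be sketched directly, as below. This form is often called tau-a and counts tied pairs as neither concordant nor discordant. Library implementations commonly use a tie-adjusted variant (tau-b), so their results may differ when ties are present. The function name and sample values are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:      # both variables move in the same direction
            concordant += 1
        elif sign < 0:    # the variables move in opposite directions
            discordant += 1
        # sign == 0 is a tie in x or y and contributes to neither count
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative values: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The loop visits every pair once, so the cost grows quadratically with the number of observations.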

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available in the phik package documentation.

Missing values

[Figures: two missing-values plots]

Sample

First rows

Row  Nº de gravidez  Glicose  Pressão sanguínea  Trícepis  Insulina  IMC   Chance de diabetes hereditária  Idade  Vive com diabetes
0    1               85       66                 29        0         26.6  0.351                           31     0
1    8               183      64                 0         0         23.3  0.672                           32     1
2    1               89       66                 23        94        28.1  0.167                           21     0
3    0               137      40                 35        168       43.1  2.288                           33     1
4    5               116      74                 0         0         25.6  0.201                           30     0
5    3               78       50                 32        88        31.0  0.248                           26     1
6    10              115      0                  0         0         35.3  0.134                           29     0
7    2               197      70                 45        543       30.5  0.158                           53     1
8    8               125      96                 0         0         0.0   0.232                           54     1
9    4               110      92                 0         0         37.6  0.191                           30     0

Last rows

Row  Nº de gravidez  Glicose  Pressão sanguínea  Trícepis  Insulina  IMC   Chance de diabetes hereditária  Idade  Vive com diabetes
757  1               106      76                 0         0         37.5  0.197                           26     0
758  6               190      92                 0         0         35.5  0.278                           66     1
759  2               88       58                 26        16        28.4  0.766                           22     0
760  9               170      74                 31        0         44.0  0.403                           43     1
761  9               89       62                 0         0         22.5  0.142                           33     0
762  10              101      76                 48        180       32.9  0.171                           63     0
763  2               122      70                 27        0         36.8  0.340                           27     0
764  5               121      72                 23        112       26.2  0.245                           30     0
765  1               126      60                 0         0         30.1  0.349                           47     1
766  1               93       70                 31        0         30.4  0.315                           23     0